Bounded similarity querying for time-series data

Authors

  • Dina Q. Goldin
  • Todd D. Millstein
  • Ayferi Kutlu
Abstract

We define the problem of bounded similarity querying in time-series databases, which generalizes earlier notions of similarity querying. Given a (sub)sequence S, a query sequence Q, lower and upper bounds on shifting and scaling parameters, and a tolerance ε, S is considered boundedly similar to Q if S can be shifted and scaled within the specified bounds to produce a modified sequence S′ whose distance from Q is within ε. We use similarity transformation to formalize the notion of bounded similarity. We then describe a framework that supports the resulting set of queries; it is based on a fingerprint method that normalizes the data and saves the normalization parameters. For off-line data, we provide an indexing method with a single index structure and search technique for handling all the special cases of bounded similarity querying. Experimental investigations find the performance of our method to be competitive with earlier, less general approaches.
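
The definition in the abstract can be illustrated with a minimal sketch. It is not the fingerprint or indexing method of the paper; it is a direct check under stated assumptions: the transformation is taken to be S′ = a·S + b, the distance is taken to be Euclidean, and the function name `bounded_similar` and its parameters are hypothetical. Because the squared distance is a convex quadratic in (a, b), its minimum over the bounding box is either the unconstrained least-squares fit (if that lies inside the bounds) or a point on one of the four edges of the box.

```python
import math


def bounded_similar(s, q, a_bounds, b_bounds, eps):
    """Check bounded similarity of s to q (illustrative sketch only).

    Assumes S' = a*S + b and Euclidean distance; neither choice is
    stated in the abstract. Returns (is_similar, a, b, distance) for
    the best scale a and shift b found within the given bounds.
    """
    n = len(s)
    if n == 0 or n != len(q):
        raise ValueError("sequences must be non-empty and of equal length")
    a_lo, a_hi = a_bounds
    b_lo, b_hi = b_bounds

    def dist(a, b):
        return math.sqrt(sum((a * x + b - y) ** 2 for x, y in zip(s, q)))

    def clip(v, lo, hi):
        return max(lo, min(hi, v))

    mean_s = sum(s) / n
    mean_q = sum(q) / n
    var_s = sum((x - mean_s) ** 2 for x in s)
    cov = sum((x - mean_s) * (y - mean_q) for x, y in zip(s, q))
    sum_ss = sum(x * x for x in s)

    def best_b(a):
        # Optimal shift for a fixed scale, clipped to the shift bounds.
        return clip(mean_q - a * mean_s, b_lo, b_hi)

    def best_a(b):
        # Optimal scale for a fixed shift, clipped to the scale bounds.
        if sum_ss == 0:
            return a_lo  # s is all zeros, so the scale has no effect
        raw = sum(x * (y - b) for x, y in zip(s, q)) / sum_ss
        return clip(raw, a_lo, a_hi)

    candidates = []
    # Unconstrained least-squares fit, kept only if it satisfies the bounds.
    if var_s > 0:
        a0 = cov / var_s
        b0 = mean_q - a0 * mean_s
        if a_lo <= a0 <= a_hi and b_lo <= b0 <= b_hi:
            candidates.append((a0, b0))
    # Best point on each of the four edges of the bounding box.
    for a in (a_lo, a_hi):
        candidates.append((a, best_b(a)))
    for b in (b_lo, b_hi):
        candidates.append((best_a(b), b))

    a, b = min(candidates, key=lambda ab: dist(*ab))
    d = dist(a, b)
    return d <= eps, a, b, d


# Q is exactly 2*S + 1, so the outcome depends on whether the bounds admit a = 2.
print(bounded_similar([1, 2, 3], [3, 5, 7], (1, 3), (0, 2), 0.01))
print(bounded_similar([1, 2, 3], [3, 5, 7], (0.5, 1.5), (0, 2), 0.5))
```

Setting the bounds to a single point (for example, scale fixed at 1 and shift fixed at 0) reduces the check to a plain distance threshold, which shows how bounds of this kind can cover stricter and looser notions of similarity.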


Similar resources

An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and measuring their similarity is a core part of the MTS analysis process. Many research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...


Querying and mining of time series data: experimental comparison of representations and distance measures

The last decade has witnessed tremendous growth of interest in applications that deal with querying and mining of time series data. Numerous representation methods for dimensionality reduction and similarity measures geared towards time series have been introduced. Each individual work introducing a particular method has made specific claims and, aside from the occasional theoretical justif...


A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...


Landmarks: a New Model for Similarity-based Pattern Querying in Time Series Databases

In this paper we present the Landmark Model, a model for time series that yields new techniques for similarity-based time series pattern querying. The Landmark Model does not follow traditional similarity models that rely on pointwise Euclidean distance. Instead, it leads to Landmark Similarity, a general model of similarity that is consistent with human intuition and episodic memory. By tracki...


A Novel methodology for Searching Dimension Incomplete Database

This manuscript deals with similarity querying problems in cases where data loss exists. Limitations of traditional methodologies for querying incomplete data in database, data mining, and information retrieval research have prompted a shift toward the development of new models. This investigation is based on a model built on the ARIMA constructional model to check th...



Journal:
  • Inf. Comput.

Volume 194  Issue 

Pages  -

Publication date: 2004